We study the distributions of citations received by a single publication within several disciplines, 
spanning broad areas of science. We show that the probability that an article is cited c times has 
large variations between different disciplines, but all distributions are rescaled on a universal curve 
when the relative indicator c_f = c/c_0 is considered, where c_0 is the average number of citations per 
article for the discipline. In addition we show that the same universal behavior occurs when citation 
distributions of articles published in the same field, but in different years, are compared. These 
findings provide a strong validation of c_f as an unbiased indicator for citation performance across 
disciplines and years. Based on this indicator, we introduce a generalization of the h-index suitable 
for comparing scientists working in different fields. 



I. INTRODUCTION 

Citation analysis is a bibliometric tool that is becom- 
ing increasingly popular to evaluate the performance of 
different actors in the academic and scientific arena, rang- 
ing from individual scholars [ESQ, to journals, depart- 
ments, universities [i| and national institutions Q up to 
whole countries Q . The outcome of such analysis often 
plays a crucial role to decide which grants are awarded, 
how applicants for a position are ranked, even the fate 
of scientific institutions. It is then crucial that citation 
analysis is carried out in the most precise and unbiased 
way. 

Citation analysis has a very long history and many po- 
tential problems have been identified @, H, , the most 
critical being that often a citation does not reflect, nor is it 
intended to reflect, the scientific merit of the cited work 
(in terms of quality or relevance). Additional sources of 
bias are, to mention just a few, self-citations, implicit ci- 
tations, the increase in the total number of citations with 
time or the correlation between the number of authors of 
an article and the number of citations it receives [10]. 

In this work we consider one of the most relevant fac- 
tors that may hamper a fair evaluation of scientific per- 
formance: field variation. Publications in certain disci- 
plines are typically cited much more or much less than 
in others. This may happen for several reasons, includ- 
ing an uneven number of cited papers per article in differ- 
ent fields or unbalanced cross-discipline citations. A 
paradigmatic example is provided by mathematics: the 
highest 2006 impact factor (IF) [12] for journals in this 
category (Journal of the American Mathematical Soci- 
ety) is 2.55, whereas this figure is ten times larger or 
even more in other disciplines (for example, New Eng- 
land Journal of Medicine has 2006 IF 51.30, Cell has IF 
29.19, Nature and Science have IF 26.68 and 30.03, re- 
spectively). 

The existence of this bias is well-known @, [l(| HI] and 
it is widely recognized that comparing bare citation numbers 
is inappropriate. Many methods have been proposed 
to alleviate this problem [ij, 0, [H [H, E3 ■ They are 
based on the general idea of normalizing citation numbers 
with respect to some properly chosen reference standard. 
The choice of a suitable reference standard, that can be 
a journal, all journals in a discipline or a more compli- 
cated set [3] is a delicate issue [Hj]. Many possibilities 
exist also in the detailed implementation of the standard- 
ization procedure. Some methods are based on ranking 
articles (scientists, research groups) within one field and 
comparing relative positions across disciplines. In many 
other cases relative indicators are defined, i.e. ratios be- 
tween the bare number of citations c and some average 
measure of the citation frequency in the reference stan- 
dard. A simple example is the Relative Citation Rate 
of a group of articles [ljj], defined as the total number 
of citations they received, divided by the weighted sum 
of impact factors of the journals where the articles were 
published. 

The use of relative indicators is widespread, but empir- 
ical studies [19, 20, 21] have shown that distributions of 
article citations are very skewed, even within single dis- 
ciplines. One may wonder then whether it is appropriate 
to normalize by the average citation number, which gives 
only a very limited characterization of the whole distribu- 
tion. We address this issue in this article. 

The problem of field variation affects the evaluation of performance at many possible levels 
of detail: publications, individual scientists, research groups, institutions. Here we consider 
the simplest possible level, the evaluation of citation performance of single publications. 
When considering individuals or research groups, additional sources of bias (and of 
arbitrariness) exist, which we do not tackle here. 

Figure 1: Normalized histogram of the number of articles P(c, c_0) published in 1999 and 
having received c citations. We plot P(c, c_0) for several scientific disciplines with different 
average number c_0 of citations per article. 

As reference standard for an article, we consider the set of all papers published in journals 
that are classified in the same Journal Citation Reports scientific category as the journal 
where the publication appears (see details in Sec. VI). We take as normalizing quantity for 
citations of articles belonging to a given scientific field the average number c_0 of citations 
received by all articles in that discipline published in the same year. We perform an empirical 
analysis of the distribution of citations for publications in various disciplines and we show 
that the large variability in the number of bare citations c is fully accounted for when 
c_f = c/c_0 is considered. The distribution of this relative performance index is the same for 
all fields. No matter whether, for instance, Developmental Biology, Nuclear Physics or 
Aerospace Engineering are considered, the chance of having a particular value of c_f is the 
same. Moreover, we show that c_f allows one to properly take into account the differences, 
within a single discipline, between articles published in different years. This provides a 
strong validation of the use of c_f as an unbiased relative indicator of scientific impact for 
comparison across fields and years. 
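
The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `relative_indicators`, the tuple layout and the toy numbers are hypothetical and not part of the original analysis.

```python
from collections import defaultdict

def relative_indicators(articles):
    """Return c_f = c / c_0 for each article.

    `articles` is a list of (category, year, citations) tuples; c_0 is the
    mean citation count over all articles sharing the same category and year.
    Groups whose articles are all uncited (c_0 = 0) are not handled here.
    """
    totals = defaultdict(lambda: [0, 0])  # (category, year) -> [sum of c, count]
    for category, year, c in articles:
        totals[(category, year)][0] += c
        totals[(category, year)][1] += 1
    c0 = {key: s / n for key, (s, n) in totals.items()}
    return [c / c0[(category, year)] for category, year, c in articles]

# Toy data, made up for illustration (not taken from the paper).
toy_articles = [
    ("Mathematics", 1999, 2),
    ("Mathematics", 1999, 6),
    ("Developmental Biology", 1999, 20),
    ("Developmental Biology", 1999, 60),
]
print(relative_indicators(toy_articles))
```

In this toy pool the two fields differ by a factor of ten in c_0, yet the articles end up with the same pair of c_f values, which is the kind of rescaling the text refers to.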



II. VARIABILITY OF CITATION STATISTICS 
IN DIFFERENT DISCIPLINES 

First of all we show explicitly that the distribution of 
the number of articles published in some year and cited 
a certain number of times strongly depends on the disci- 
pline considered. In Fig. [1] we plot the normalized distri- 
butions of citations to articles that appeared in 1999 in 
all journals belonging to several different disciplines ac- 
cording to the Journal Citation Reports classification. 

From this figure it is apparent that the chance that a publication is cited strongly depends 
on the category the article belongs to. For example, a publication with 100 citations is 
approximately 50 times more common in "Developmental Biology" than in "Engineering, 
Aerospace". This has obvious implications for the evaluation of outstanding scientific 
achievements: the simple count of the number of citations is patently misleading to assess 
whether an article in Developmental Biology is more successful than one in Aerospace 
Engineering. 

Figure 2: Rescaled probability distribution c_0 P(c, c_0) of the relative indicator c_f = c/c_0, 
showing that the universal scaling holds for all scientific disciplines considered (see Table I). 
The dashed line is a lognormal fit with σ² = 1.3. 

III. DISTRIBUTION OF THE RELATIVE 
INDICATOR c_f 

A first step toward properly taking into account field variations is to recognize that the 
differences in the bare citation distributions are essentially not due to specific 
discipline-dependent factors, but are instead related to the pattern of citations in the field, 
as measured by the average number of citations per article c_0. It is natural then to try to 
factor out the bias induced by the difference in the value of c_0 by considering a relative 
indicator, i.e. measuring the success of a publication by the ratio c_f = c/c_0 between the 
number of citations received and the average number of citations received by articles 
published in its field in the same year. Fig. 2 shows that this procedure leads to a very good 
collapse of all curves for different values of c_0 onto a single shape. The distribution of the 
relative indicator c_f seems then universal for all categories considered and resembles a 
lognormal distribution. In order to make these observations more quantitative, we have 
fitted each curve in Fig. 2 for c_f > 0.1 with a lognormal curve 

    F(c_f) = \frac{1}{\sigma c_f \sqrt{2\pi}} \exp\left( -\frac{[\ln(c_f) - \mu]^2}{2\sigma^2} \right)    (1) 

where the relation σ² = -2μ, due to the fact that the expected value of the variable c_f is 1, 
reduces the number of fitting parameters to one. All fitted values of σ², reported in Table I, 
are compatible within two standard deviations, except for one (Anesthesiology) that is in any 
case within three standard deviations of all the others. Values of χ² per degree of freedom, 
also reported in Table I, indicate that the fit is good. 

Index  Subject Category                  Year   N_p    c_0     c_max   σ²       χ²/df 
1      Agricultural Economics & Policy   1999   266    6.88    42      1.0(1)   0.007 
2      Allergy                           1999   1530   17.39   271     1.4(2)   0.012 
3      Anesthesiology                    1999   3472   13.25   282     1.8(2)   0.009 
4      Astronomy & Astrophysics          1999   7399   23.77   1028    1.1(1)   0.003 
5      Biology                           1999   3400   14.6    413     1.3(1)   0.004 
6      Computer Science, Cybernetics     1999   704    8.49    100     1.3(1)   0.004 
7      Developmental Biology             1999   2982   38.67   520     1.3(3)   0.002 
8      Engineering, Aerospace            1999   1070   5.65    95      1.4(1)   0.003 
9      Hematology                        1990   4423   41.05   1424    1.5(1)   0.002 
10     Hematology                        1999   6920   30.61   966     1.3(1)   0.004 
11     Hematology                        2004   8695   15.66   1014    1.3(1)   0.003 
12     Mathematics                       1999   8440   5.97    191     1.3(4)   0.001 
13     Microbiology                      1999   9761   21.54   803     1.0(1)   0.005 
14     Neuroimaging                      1990   444    25.26   518     1.1(1)   0.004 
15     Neuroimaging                      1999   1073   23.16   463     1.4(1)   0.003 
16     Neuroimaging                      2004   1395   12.68   132     1.1(1)   0.005 
17     Physics, Nuclear                  1990   3670   13.75   387     1.4(1)   0.001 
18     Physics, Nuclear                  1999   3965   10.92   434     1.4(4)   0.001 
19     Physics, Nuclear                  2004   4164   6.94    218     1.4(1)   0.001 
20     Tropical Medicine                 1999   1038   12.35   126     1.1(1)   0.017 

Table I: List of all scientific disciplines considered in this article. For each category we 
report the total number of articles N_p, the average number of citations c_0, the maximum 
number of citations c_max, the value of the fitting parameter σ² in Eq. (1) and the 
corresponding χ² per degree of freedom. Data refer to articles published in journals listed by 
Journal Citation Reports under a specific subject category. 

This allows one to conclude that, rescaling the distribution of citations for publications in a 
scientific discipline by their average number, a universal curve is found, independent of the 
specific discipline. Fitting a single curve for all categories, a lognormal distribution with 
σ² = 1.3 is found, which is reported in Figure 2. 
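
As a numerical sanity check of the one-parameter lognormal of Eq. (1), the following Python sketch fixes μ = -σ²/2 and verifies that the curve is normalized and has mean 1. The function names and the integration grid are illustrative assumptions, not part of the original fitting procedure.

```python
import math

def lognormal_pdf(cf, sigma2):
    """Lognormal density of Eq. (1), with mu fixed to -sigma2 / 2."""
    mu = -sigma2 / 2.0  # this choice makes the expected value of c_f equal to 1
    sigma = math.sqrt(sigma2)
    exponent = -(math.log(cf) - mu) ** 2 / (2.0 * sigma2)
    return math.exp(exponent) / (sigma * cf * math.sqrt(2.0 * math.pi))

def integrate_on_log_grid(f, lo=1e-8, hi=1e8, n=20_000):
    """Midpoint-rule integral of f over [lo, hi] on a logarithmic grid."""
    step = (math.log(hi) - math.log(lo)) / n
    total = 0.0
    for k in range(n):
        x = math.exp(math.log(lo) + (k + 0.5) * step)
        total += f(x) * x * step  # dx = x d(ln x)
    return total

sigma2 = 1.3  # value of the global fit quoted in the text
norm = integrate_on_log_grid(lambda x: lognormal_pdf(x, sigma2))
mean = integrate_on_log_grid(lambda x: x * lognormal_pdf(x, sigma2))
print(norm, mean)
```

Both integrals should come out very close to 1, which is why a single parameter σ² is enough to describe each rescaled distribution.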

Interestingly, a similar universality for the distribution 
of the relative performance is found, in a totally different 
context, when the number of votes received by candidates 
in proportional elections is considered [l^j. In that case, 
the scaling curve is also well-fitted by a lognormal with 
parameter σ² ≈ 1.1. For universality in the dynamics of 
academic research activities see also [23]. 

The universal scaling obtained provides a solid grounding for comparison between articles in 
different fields. To make this even more visually evident, we have ranked all articles 
belonging to a pool of different disciplines (spanning broad areas of science) according 
either to c or to c_f. We have then computed the percentage of publications of each discipline 
that appear in the top z% of the global rank. If the ranking is fair, the percentage for each 
discipline should be around z% with small fluctuations. Fig. 3 clearly shows that when 
articles are ranked according to the unnormalized number of citations c there are wide 
variations among disciplines. Such variations are dramatically reduced instead when the 
relative indicator c_f is used. This occurs for various choices of the percentage z. More 
quantitatively, assuming that articles of the various disciplines are scattered uniformly along 
the rank axis, one would expect the average bin height in Fig. 3 to be z% with a standard 
deviation 

    \sigma_z = \sqrt{ z (100 - z) \sum_{i=1}^{N_c} \frac{1}{N_i N_c} } 

where N_c is the number of categories and N_i the number of articles in the i-th category. 
When the ranking is performed according to c_f = c/c_0 we find (Table II) a very good 
agreement with the hypothesis that the ranking is unbiased, while strong evidence that the 
ranking is biased is found when c is used. For example, for z = 20%, σ_z = 1.15% for 
c_f-based ranking, while σ_z = 12.37% if c is used, as opposed to the value σ_z = 1.09% in 
the hypothesis of unbiased ranking. Figures 2 and 3 allow one to conclude that c_f is an 
unbiased indicator for comparing the scientific impact of publications in different 
disciplines. For the normalization of the relative indicator, we have considered the average 
number c_0 of citations per article published in the same year and in the same field. This is 
a very natural choice, giving to the numerical value of c_f the direct interpretation as 
relative citation performance of the publication. In the literature this quantity is also 
indicated as the "item oriented field normalized citation score" [24], an analogue for a single 
publication of the popular CWTS (Centre for Science and Technology Studies, Leiden) field 
normalized citation score or "crown indicator" [25]. In agreement with the findings of 
Ref. [11], c_0 shows very little correlation with the overall size of the field, as measured by 
the total number of articles. 

Figure 3: We rank all articles according to the bare number of citations c and the relative 
indicator c_f. We then plot the percentage of articles of a particular discipline present in the 
top z% of the general ranking, for the rank based on the number of citations (histograms on 
the left in each panel) and based on the relative indicator c_f (histograms on the right). 
Different values of z (different panels) lead to a very similar pattern of results. The average 
values and the standard deviations of the bin heights shown are also reported in Table II. 
The numbers identify the disciplines as they are indicated in Table I. 

z     σ_z (theor)   z(c)    σ_z(c)   z(c_f)   σ_z(c_f) 
5     0.59          4.38    4.73     5.14     0.51 
10    0.81          8.69    7.92     10.07    0.67 
20    1.09          17.68   12.37    20.03    1.15 
40    1.33          35.67   17.48    39.86    2.58 

Table II: Average and standard deviation for the bin heights in Fig. 3. Comparison between 
the values expected theoretically for unbiased ranking (first two columns), and those 
obtained empirically when articles are ranked according to c (third and fourth columns) and 
according to c_f (last two columns). 
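
The spread expected for an unbiased ranking can be evaluated directly from the category sizes. The sketch below is a hedged illustration: it assumes that the pool of disciplines used for the ranking consists of the fourteen 1999 entries of Table I (the text does not state this explicitly), and the function name `sigma_z_unbiased` is hypothetical.

```python
import math

def sigma_z_unbiased(z, sizes):
    """Expected spread (in percent) of the bin heights for an unbiased ranking.

    z is the percentage defining the top of the ranking; sizes lists the
    number of articles N_i in each of the N_c categories.
    """
    n_c = len(sizes)
    return math.sqrt(z * (100.0 - z) * sum(1.0 / n_i for n_i in sizes) / n_c)

# N_p values of the fourteen 1999 entries of Table I (assumed here to be the
# pool of disciplines used for the ranking).
sizes_1999 = [266, 1530, 3472, 7399, 3400, 704, 2982, 1070,
              6920, 8440, 9761, 1073, 3965, 1038]

for z in (5, 10, 20, 40):
    print(z, sigma_z_unbiased(z, sizes_1999))
```

Under this assumption the values agree with the theoretical column of Table II to within about 0.01, the rounding precision of the table.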

The previous analysis compares distributions of citations to articles published in a single 
year, 1999. It is known that different temporal patterns of citations exist, with some articles 
starting soon to receive citations, while others ("sleeping beauties") go unnoticed for a long 
time, after which they are recognized as seminal and begin to attract a large number of 
citations [26, 27]. Other differences exist between disciplines, with noticeable fluctuations in 
the cited half-life indicator across fields. It is then natural to wonder whether the 
universality of distributions for articles published in the same year extends longitudinally in 
time so that the relative indicator allows comparison of articles published in different years. 
For this reason, in Fig. 4 we compare the plot of c_0 P(c, c_0) vs c_f for publications in the 
same scientific discipline that appeared in three different years. The value of c_0 obviously 
grows as older publications are considered, but the rescaled distribution remains 
conspicuously the same. 



IV. A GENERALIZED H-INDEX 

Since its introduction in 2005, the h-index [1] has en- 
joyed a spectacularly quick success [H)]: it is now a well 
established standard tool for the evaluation of the scien- 
tific performance of scientists. Its popularity is partly due 
to its simplicity: the h-index of an author is h if h of his 
N articles have at least h citations each, and the other 
N — h articles have at most h citations each. Despite 
its success, as all other performance metrics the h-index 
has some shortcomings, as already pointed out by Hirsch 
himself. One of them is the difficulty to compare authors 
in different disciplines. The identification of the relative 
indicator c_f as the correct metric to compare articles in 
different disciplines naturally suggests its use in a gener- 
alized version of the h-index taking properly into account 
different citation patterns across disciplines. However, 
just ranking articles according to c_f, instead of on the basis of the bare citation number c, 
is not enough. A crucial ingredient of the h-index is the number of articles published by an 
author. As Fig. 5 shows, such a quantity also depends on the discipline considered: in some 
disciplines the average number of articles published by an author in a year is much larger 
than in others. But also in this case this variability is rescaled away if the number N of 
publications in a year by an author is divided by the average value in the discipline N_0. 
Interestingly, the universal curve is fitted reasonably well over almost two decades by a 
power-law behavior P(N, N_0) ∼ (N/N_0)^(-δ) with δ = 3.5(5). 

Figure 4: Rescaled probability distribution c_0 P(c, c_0) of the relative indicator c_f = c/c_0 
for three disciplines ("Hematology", "Neuroimaging", and "Physics, Nuclear") for articles 
published in different years (1990, 1999 and 2004). In spite of the natural variation of c_0 
(c_0 grows as a function of the elapsed time), the universal scaling observed over different 
disciplines naturally holds also for articles published in different periods of time. The dashed 
line is a lognormal fit with σ² 

Figure 5: Inset: distributions of the number of articles N published by an author during 
1999 in several disciplines. Main: the same distributions rescaled by the average number N_0 
of publications per author in 1999 in the different disciplines. The dashed line is a power-law 
with exponent -3.5. 



This universality allows one to define a generalized h-index, h_f, that factors out also the 
additional bias due to different publication rates, thus allowing comparisons among scientists 
working in different fields. To compute the index for an author, his/her articles are ordered 
according to c_f = c/c_0 and this value is plotted versus the reduced rank r/N_0, with r being 
the rank. In analogy with the original definition by Hirsch, the generalized index is then 
given by the last value of r/N_0 such that the corresponding c_f is larger than r/N_0. For 
instance, if an author has published 6 articles with values of c_f equal to 4.1, 2.8, 2.2, 1.6, 
0.8 and 0.4 respectively, and the value of N_0 in his discipline is 2.0, his h_f-index is equal 
to 1.5. This is because the third best article has r/N_0 = 1.5 < 2.2 = c_f, while the fourth has 
r/N_0 = 2.0 > 1.6 = c_f. We plan to present the results of the application of this generalized 
index to practical cases in a forthcoming publication. 



V. CONCLUSIONS 

In this article we have presented strong empirical evidence that the widely scattered 
distributions of citations for publications in different scientific disciplines are rescaled onto 
the same universal curve when the relative indicator c_f is used. We have also seen that the 
universal curve is remarkably stable over the years. The analysis presented here justifies the 
use of relative indicators to compare in a fair manner the impact of articles across different 
disciplines and years. This may have strong and unexpected implications. For instance, 
Figure 2 leads to the counterintuitive conclusion that an article in Aerospace Engineering 
with only 20 citations (c_f ≈ 3.54) is more successful than an article in Developmental 
Biology with 100 citations (c_f ≈ 2.58). We stress that this does not imply that the article 
with larger c_f is necessarily more "important" than the other. In an evaluation of 
importance, other field-related factors may play a role: an article with an outstanding value 
of c_f in a very narrow specialized field may be less "important" (for science in general or 
for society) than a publication with smaller c_f in a highly competitive discipline with 
potential implications in many areas. 

Since we consider single publications, the smallest pos- 
sible entities whose scientific impact can be measured, 
our results must always be taken into account when tack- 
ling other, more complicated tasks, like the evaluation of 
performance of individuals or research groups. For ex- 
ample, in situations where the simple count of the mean 
number of citations per publication is deemed to be im- 
portant, one should compute the average of c_f (and not 
of c) to evaluate impact independently of the scientific 
discipline. For what concerns the assessment of single 
authors' performance we have defined a generalized h- 
index [l[ that allows a fair comparison across disciplines 
taking into account also the different publication rates. 

Our analysis deals with two of the main sources of bias affecting comparisons of publication 
citations. It would be interesting to tackle, along the same lines, other potential sources of 
bias, as for example the number of authors, which is known to correlate with a higher 
number of citations [10]. It is natural to define a relative indicator, the number of citations 
per author. Is this normalization the correct one that leads to a universal distribution, for 
any number of authors? 

Finally, from a more theoretical point of view, an in- 
teresting goal for future work is to understand the origin 
of the universality found and how its precise functional 
form comes about. An attempt to investigate what mech- 
anisms are relevant for understanding citation distribu- 
tions is in Ref. [2!| . Further activity in the same direction 
would be definitely interesting. 

VI. METHODS 

Our empirical analysis is based on data from Thomson Scientific's Web of Science (WOS, 
www.isiknowledge.com) database, where the number of citations is counted as the total 
number of times an article appears as a reference of a more recently published article. 
Scientific journals are divided into 172 categories, from "Acoustics" to "Zoology". Within a 
single category a list of journals is provided. We consider articles published in each of these 
journals to be part of the category. Notice that the division into categories is not mutually 
exclusive: for example Physical Review D belongs both to the "Astronomy & Astrophysics" 
and to the "Physics, particles & fields" categories. For consistency, among all records 
contained in the database we consider only those classified as "article" and "letter", thus 
excluding reviews, editorials, comments and other published material likely to have an 
uncommon citation pattern. A list of the categories considered, with the relevant parameters 
that characterize them, is reported in Table I. The category "Multidisciplinary sciences" 
does not fit perfectly into the universal picture found for other categories, because the 
distribution of the number of citations is a convolution of the distributions corresponding to 
the single disciplines represented in the journals. However, if one restricts the analysis to 
the three most important multidisciplinary journals (Nature, Science, Proc. Natl. Acad. Sci. 
USA) this category also fits very well into the global universal picture. 

Our calculations neglect uncited articles; we have verified however that their inclusion just 
produces a small shift in c_0, which does not affect the results of our analysis. In the plots of 
the citation distributions, data have been grouped in bins of exponentially growing size, so 
that they are equally spaced along a logarithmic axis. For each bin, we count the number of 
articles with citation count within the bin and divide by the number of all potential values 
for the citation count that fall in the bin (i.e. all integers). This holds as well for the 
distribution of the normalized citation count c_f, as the latter is just determined by dividing 
the citation count by the constant c_0, so it is a discrete variable just like the original 
citation count. The resulting ratios obtained for each bin are finally divided by the total 
number of articles considered, so that the histograms are normalized to 1. 
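
The binning procedure described above can be sketched as follows. This is a minimal illustration, assuming integer bin edges that grow by a factor `base` (a hypothetical parameter; the growth factor actually used for the plots is not stated in the text) and toy citation counts.

```python
def log_binned_histogram(citations, base=2):
    """Normalized histogram of citation counts on exponentially growing bins.

    Bins are [1, base), [base, base**2), ...; uncited articles (c = 0) are
    dropped, as in the text. Returns a list of (low, high, density) tuples,
    where density is the fraction of articles per integer value in the bin.
    """
    cited = [c for c in citations if c >= 1]
    if not cited:
        return []
    total = len(cited)
    bins = []
    low = 1
    while low <= max(cited):
        high = low * base
        count = sum(1 for c in cited if low <= c < high)
        n_integers = high - low  # number of possible citation counts in the bin
        bins.append((low, high, count / n_integers / total))
        low = high
    return bins

# Toy citation counts, made up for illustration.
toy_counts = [0, 1, 1, 2, 3, 3, 5, 8, 13, 40]
histogram = log_binned_histogram(toy_counts)
for low, high, density in histogram:
    print(f"[{low}, {high}): {density:.4f}")
```

Summing density times the number of integers in each bin gives 1, which is the normalization stated in the text.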